910 DMSs was the main contributor to each of 1,250 DEGs.

ds, it was aimed to investigate whether the top-ranked DMS in a

regression model for a DEG was a local or a remote methylation site in the

Based on the information, a statistical analysis was carried out to

te the trend of the methylation-to-expression interplay pattern,

g the local or remote interplay.

Suppose x was used to represent the differential expression vector of the gth DEG, which was the target (dependent) variable of a regression model. In such a model, the 910 DMSs were treated as the regressors, or independent variables. Moreover, w was used to represent the vector of regression coefficients (model parameters) for the 910 regressors, and M was used to represent the matrix of differential methylation ratios of the 910 DMSs. The M matrix had 114 rows and 910 columns. The regression model for the gth DEG was defined as follows:

$$\mathbf{x} = f(\mathbf{M}, \mathbf{w})$$



In such a regression model, f was designed as a regression function, which was either linear or nonlinear. In a constrained linear regression model (such as RLR), the above equation was simplified as follows:

$$\mathbf{x} = \mathbf{M}\mathbf{w} + \lambda\,\mathbf{w}\mathbf{w}$$
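The constrained linear model can be illustrated with a minimal sketch, assuming that RLR denotes an L2-regularised (ridge) linear regression. In the conventional ridge formulation the penalty weighted by lambda enters the fitting objective, so that w minimises ||x - Mw||^2 + lambda * ||w||^2. The data below are synthetic stand-ins, and the names `ridge_fit`, `lam`, `true_w` and the choice of column 42 are hypothetical; only the 114 x 910 dimensions come from the text.

```python
import numpy as np

def ridge_fit(M, x, lam=1.0):
    """Closed-form ridge estimate: w = (M^T M + lam * I)^-1 M^T x."""
    n_features = M.shape[1]
    A = M.T @ M + lam * np.eye(n_features)
    return np.linalg.solve(A, M.T @ x)

rng = np.random.default_rng(0)
n_samples, n_dms = 114, 910                # dimensions of M taken from the text

# Synthetic stand-in for the matrix of differential methylation ratios.
M = rng.normal(size=(n_samples, n_dms))

# Hypothetical ground truth: a single DMS (column 42) drives this DEG.
true_w = np.zeros(n_dms)
true_w[42] = 3.0
x = M @ true_w + 0.1 * rng.normal(size=n_samples)

w = ridge_fit(M, x, lam=1.0)

# Rank DMSs by the magnitude of their coefficients, largest first.
ranking = np.argsort(-np.abs(w))
top_dms = int(ranking[0])
print("top-ranked DMS column:", top_dms)
```

Ranking by absolute coefficient is only one possible way to pick a top-ranked DMS; the ranking criterion actually used for each of the four models is not stated in this passage.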



There were 1,250 such regression models, one for each of the 1,250 DEGs. These models were designed to investigate how DMSs, which were from either local or remote methylation sites, contributed to the differential expression

attern) of the gth DEG. Only the top-ranked DMSs were used for

ct analysis in this chapter. If a remote DMS was ranked at the top,

the distance between this remote DMS and the gth DEG was then

ted. The Lasso, RLR, SVM and random forest were used to

e relationship between the variables (between the gth DEG and all

DMSs) and to rank the variables. Figure 4.29 shows the R-square

ments for four models. It can be seen that four models all fitted to

well.
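For reference, the R-square (coefficient of determination) used to compare model fits can be sketched as below. This is the standard definition, 1 - SS_res / SS_tot. Whether the reported values were computed on training or held-out samples is not stated here, and the function name `r_squared` and the toy numbers are purely illustrative.

```python
import numpy as np

def r_squared(x_true, x_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    x_true = np.asarray(x_true, dtype=float)
    x_pred = np.asarray(x_pred, dtype=float)
    ss_res = np.sum((x_true - x_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((x_true - x_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Toy example: observed versus predicted differential expression values.
x_true = [1.0, 2.0, 3.0, 4.0]
x_pred = [1.1, 1.9, 3.2, 3.8]
score = r_squared(x_true, x_pred)
print(round(score, 4))
```

A value close to 1 indicates that the model explains most of the variance in the differential expression vector of a DEG.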